Univariate TS Models (ARIMA/SARIMA)

Code
canada <- read.csv("data/clean_data/24monthly_canada_freight.csv")
canada <- canada[,c("Date","Value")]
canada_ts <- ts(canada$Value, start = decimal_date(as.Date("2006-01-01", format = "%Y-%m-%d")), end = decimal_date(as.Date("2023-01-01", format = "%Y-%m-%d")), frequency = 12)


employment <- read.csv("data/clean_data/32monthly_employment.csv")
employment <- employment[,c("Date","Transportation.Employment...Air.Transportation")]
employment_ts <- ts(employment$Transportation.Employment...Air.Transportation, start = decimal_date(as.Date("2005-01-01", format = "%Y-%m-%d")), end = decimal_date(as.Date("2024-01-01", format = "%Y-%m-%d")), frequency = 12)


tsi <- read.csv("data/clean_data/35monthly_TSI.csv")
tsi_ts <- ts(tsi$Transportation.Services.Index...Freight, start = decimal_date(as.Date("2000-01-01", format = "%Y-%m-%d")), end = decimal_date(as.Date("2023-01-01", format = "%Y-%m-%d")), frequency = 12)


air <- read.csv("data/clean_data/33revenue.csv")
air <- air[air$Mode=="Air carrier, domestic",c("Year","Value")]
air <- air[order(air$Year), ]
air_ts <- ts(air$Value, start = decimal_date(as.Date("2000-01-01", format = "%Y-%m-%d")), end = decimal_date(as.Date("2021-01-01", format = "%Y-%m-%d")), frequency = 1)


ups <- getSymbols("UPS", auto.assign = FALSE, from = "2017-01-01", to = "2024-01-01", src = "yahoo")
ups <- data.frame(ups)
ups$date <- as.Date(rownames(ups), "%Y-%m-%d")
ups_ts <- ts(ups$UPS.Adjusted, start = decimal_date(as.Date("2017-01-03", format = "%Y-%m-%d")), frequency = 365.25)

ACF & PACF Plots

Code
plot1<-ggAcf(canada_ts)+ggtitle("U.S.-Canada Freight Value ACF") + theme_bw()
plot2<- ggPacf(canada_ts)+theme_bw()+ggtitle("U.S.-Canada Freight Value PACF")

grid.arrange(plot1, plot2,nrow=2)

Based on the ACF and PACF plots, we observe strong correlation at the early lags of the ACF, decaying gradually but remaining relatively strong through lag 12. The PACF likewise shows strong correlation at lag 1, with moderate correlation extending to lag 13. The slow decay of the ACF suggests that the series is non-stationary.
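The autocorrelations these plots display can be computed directly. A minimal base-R sketch (on the built-in AirPassengers series rather than the freight data), checked against stats::acf():

```r
# Sample autocorrelation at lag k: lag-k autocovariance divided by the
# variance, both using the same mean and the same divisor, as stats::acf() does.
sample_acf <- function(x, k) {
  n <- length(x)
  m <- mean(x)
  sum((x[(k + 1):n] - m) * (x[1:(n - k)] - m)) / sum((x - m)^2)
}

x <- as.numeric(AirPassengers)                     # built-in monthly series
r1 <- sample_acf(x, 1)
r1_ref <- as.numeric(acf(x, plot = FALSE)$acf[2])  # element 1 is lag 0
abs(r1 - r1_ref) < 1e-12
```

A slowly decaying sequence of these values across lags, as in the plots above, is the visual signature of non-stationarity.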

Code
plot1<-ggAcf(employment_ts)+ggtitle("U.S. Air Transportation Employment ACF") + theme_bw()
plot2<- ggPacf(employment_ts)+theme_bw()+ggtitle("U.S. Air Transportation Employment PACF")

grid.arrange(plot1, plot2,nrow=2)

Based on the ACF plot, correlation is strong from lag 1 through lag 16, decaying only gradually. The PACF plot shows strong correlation at lags 1 and 2. The slow decay of the ACF indicates that the series is likely non-stationary.

Code
plot1<-ggAcf(tsi_ts)+ggtitle("U.S. Freight Transportation Services Index ACF") + theme_bw()
plot2<- ggPacf(tsi_ts)+theme_bw()+ggtitle("U.S. Freight Transportation Services Index PACF")

grid.arrange(plot1, plot2,nrow=2)

Based on the ACF plot, there is a strong correlation at lag 1 that decreases only slightly and remains strong through the end of the plot. The PACF plot shows strong correlation at lag 1. The slow decay of the ACF indicates that the series is likely non-stationary.

Code
plot1<-ggAcf(air_ts)+ggtitle("U.S. Domestic Air Carrier Average Freight Revenue ACF") + theme_bw()
plot2<- ggPacf(air_ts)+theme_bw()+ggtitle("U.S. Domestic Air Carrier Average Freight Revenue PACF")

grid.arrange(plot1, plot2,nrow=2)

The ACF plot shows strong correlation at lag 1 and moderate correlation at lags 2 and 3, while the PACF plot shows a significant spike only at lag 1. The persistence of correlation across the early lags suggests the series is likely non-stationary.

Code
plot1<-ggAcf(ups_ts, lag=100)+ggtitle("UPS Stock Price ACF") + theme_bw()
plot2<- ggPacf(ups_ts, lag=100)+theme_bw()+ggtitle("UPS Stock Price PACF")

grid.arrange(plot1, plot2,nrow=2)

The ACF plot shows strong correlation beginning at lag 1 that decreases only slightly and remains strong through lag 100, while the PACF plot shows a significant spike only at lag 1. The sustained autocorrelation in the ACF indicates that the series is non-stationary.

Augmented Dickey-Fuller Test on Original Data

Code
tseries::adf.test(canada_ts)

    Augmented Dickey-Fuller Test

data:  canada_ts
Dickey-Fuller = -3.0511, Lag order = 5, p-value = 0.1356
alternative hypothesis: stationary

With a p-value above 0.05, there is insufficient evidence to reject the unit-root null hypothesis at the 5% significance level, so we treat the series as non-stationary. This is consistent with the earlier analysis, where the slowly decaying autocorrelation in the ACF plot pointed to non-stationarity.
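Mechanically, the ADF statistic is the t-value on the lagged level in a regression of the differenced series on that lagged level; tseries::adf.test augments this with lagged differences and a trend term. A minimal non-augmented sketch, on a simulated series that is stationary by construction:

```r
# Non-augmented Dickey-Fuller regression: diff(y)_t on y_{t-1}. Under the
# unit-root null the coefficient on the lagged level is zero; a strongly
# negative t-value is evidence for stationarity.
set.seed(42)
y <- as.numeric(arima.sim(list(ar = 0.5), n = 200))  # stationary AR(1)
dy   <- diff(y)
ylag <- y[-length(y)]
df_fit <- lm(dy ~ ylag)
t_stat <- unname(summary(df_fit)$coefficients["ylag", "t value"])
t_stat < -2.86   # more negative than the usual 5% critical value
```

For the freight series above, the corresponding statistic (-3.05) is not negative enough to clear the critical value, hence the large p-value.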

Code
tseries::adf.test(employment_ts)

    Augmented Dickey-Fuller Test

data:  employment_ts
Dickey-Fuller = -3.1311, Lag order = 6, p-value = 0.1008
alternative hypothesis: stationary

With a p-value above 0.05, there is insufficient evidence to reject the unit-root null hypothesis at the 5% significance level, so we treat the series as non-stationary. This is consistent with the earlier analysis, where the slowly decaying autocorrelation in the ACF plot pointed to non-stationarity.

Code
tseries::adf.test(tsi_ts)

    Augmented Dickey-Fuller Test

data:  tsi_ts
Dickey-Fuller = -2.4166, Lag order = 6, p-value = 0.4005
alternative hypothesis: stationary

With a p-value above 0.05, there is insufficient evidence to reject the unit-root null hypothesis at the 5% significance level, so we treat the series as non-stationary. This is consistent with the earlier analysis, where the slowly decaying autocorrelation in the ACF plot pointed to non-stationarity.

Code
tseries::adf.test(air_ts)

    Augmented Dickey-Fuller Test

data:  air_ts
Dickey-Fuller = -0.85187, Lag order = 2, p-value = 0.9424
alternative hypothesis: stationary

With a p-value above 0.05, there is insufficient evidence to reject the unit-root null hypothesis at the 5% significance level, so we treat the series as non-stationary. This is consistent with the earlier analysis, where the slowly decaying autocorrelation in the ACF plot pointed to non-stationarity.

Code
tseries::adf.test(ups_ts)

    Augmented Dickey-Fuller Test

data:  ups_ts
Dickey-Fuller = -2.2106, Lag order = 12, p-value = 0.4892
alternative hypothesis: stationary

With a p-value above 0.05, there is insufficient evidence to reject the unit-root null hypothesis at the 5% significance level, so we treat the series as non-stationary. This is consistent with the earlier analysis, where the slowly decaying autocorrelation in the ACF plot pointed to non-stationarity.

Differencing

Code
require(gridExtra)
fit = lm(canada_ts~time(canada_ts), na.action=NULL)

plot1<-autoplot(diff(canada_ts), main="First Difference") +theme_bw()
plot2 <- ggAcf(diff(canada_ts), 48) + ggtitle("First Difference Data")+theme_bw()

grid.arrange(plot1, plot2,nrow=2)

The differenced data exhibits greater stationarity than the original data. This improvement is evident in the ACF plot of the differenced series, which shows a sharp drop-off, indicating a substantial reduction in autocorrelation. However, the first difference still shows some seasonality, suggesting that further (seasonal) differencing may be necessary to fully remove it.
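As a reminder of what the transformation does, diff() subtracts each observation from the one before it, and differencing twice simply composes the operation; a tiny base-R check:

```r
# First difference: x_t - x_{t-1}; second difference: difference of the
# difference. A lag-12 difference (diff(x, lag = 12)) would target the
# seasonal pattern instead.
x <- c(10, 12, 15, 11, 14)
d1 <- diff(x)                                   # 2  3 -4  3
identical(d1, x[-1] - x[-length(x)])
identical(diff(x, differences = 2), diff(d1))   # second-order difference
```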

Code
plot1 <- ggAcf(diff(canada_ts), 48) + ggtitle("First Difference Data")+theme_bw()
plot2 <- ggAcf(diff(diff(canada_ts)), 48) + ggtitle("Second Differenced Data") + theme_bw()

grid.arrange(plot1, plot2, nrow=2)

After applying a second-order difference, the ACF plot shows stronger autocorrelation at the first lag, a sign that the data may be over-differenced. Consequently, I have opted to retain the first-order difference, which achieves a satisfactory level of stationarity.

Code
require(gridExtra)
fit1 = lm(employment_ts~time(employment_ts), na.action=NULL)

plot1<-autoplot(diff(employment_ts), main="First Difference") +theme_bw()
plot2 <- ggAcf(diff(employment_ts), 48) + ggtitle("First Difference Data")+theme_bw()

grid.arrange(plot1, plot2,nrow=2)

The differenced data exhibits greater stationarity compared to the original data. This improvement is evident in the ACF plot of the differenced data, which shows a significant drop-off, indicating a substantial reduction in autocorrelation beyond those lags. However, despite the improvement, the first order difference still shows some seasonality in the plot, suggesting that further differencing may be necessary to fully address the seasonality present in the data.

Code
plot1 <- ggAcf(diff(employment_ts), 48) + ggtitle("First Difference Data")+theme_bw()
plot2 <- ggAcf(diff(diff(employment_ts)), 48) + ggtitle("Second Differenced Data") + theme_bw()

grid.arrange(plot1, plot2, nrow=2)

The plot above demonstrates that second-order differencing effectively renders the data stationary, a crucial prerequisite for accurate modeling.

Code
require(gridExtra)
fit2 = lm(tsi_ts~time(tsi_ts), na.action=NULL)

plot1<- autoplot(diff(tsi_ts), main="First Difference") +theme_bw()
plot2 <- ggAcf(diff(tsi_ts), 48) + ggtitle("First Difference Data")+theme_bw()

grid.arrange(plot1, plot2,nrow=2)

From both the original plots and the ACF plots, it’s evident that the differenced data exhibits greater stationarity compared to the original data. The ACF plot of the differenced data shows a significant drop-off, indicating a lack of autocorrelation beyond those lags, which is characteristic of stationary data.

Code
require(gridExtra)
fit3 = lm(air_ts~time(air_ts), na.action=NULL)

plot1<-autoplot(diff(air_ts), main="First Difference") +theme_bw()
plot2 <- ggAcf(diff(air_ts), 48) + ggtitle("First Difference Data")+theme_bw()

grid.arrange(plot1, plot2,nrow=2)

The differenced data exhibits greater stationarity compared to the original data. This improvement is evident in the ACF plot of the differenced series, which shows a significant drop-off, indicating a lack of autocorrelation beyond those lags. This drop-off is characteristic of stationary data.

Code
require(gridExtra)
fit4 = lm(ups_ts~time(ups_ts), na.action=NULL)

plot1<-autoplot(diff(ups_ts), main="First Difference") +theme_bw()
plot2 <- ggAcf(diff(ups_ts), 48) + ggtitle("First Difference Data")+theme_bw()

grid.arrange(plot1, plot2,nrow=2)

The differenced data exhibits greater stationarity compared to the original. This improvement is evident in the ACF plot of the differenced data, which shows a significant drop-off, indicating a substantial reduction in autocorrelation beyond those lags. However, despite the improvement, the first-order difference still shows some seasonality in the plot, suggesting that further differencing may be necessary to fully address it.

Code
plot1 <- ggAcf(diff(ups_ts), 48) + ggtitle("First Difference Data")+theme_bw()
plot2 <- ggAcf(diff(diff(ups_ts)), 48) + ggtitle("Second Differenced Data") + theme_bw()

grid.arrange(plot1, plot2, nrow=2)

After performing a second-order difference, strong autocorrelation is observed at lag 1 in the ACF plot, a sign that the data may be over-differenced. Therefore, I have decided to retain the first-order difference, which achieves a satisfactory level of stationarity.

Augmented Dickey-Fuller Test on Differenced Data

Code
tseries::adf.test(diff(canada_ts))

    Augmented Dickey-Fuller Test

data:  diff(canada_ts)
Dickey-Fuller = -6.549, Lag order = 5, p-value = 0.01
alternative hypothesis: stationary

With a p-value below 0.05, there is sufficient evidence to reject the unit-root null hypothesis at the 5% significance level, so we conclude that the differenced series is stationary. This agrees with the earlier analysis, where the ACF of the differenced series dropped off quickly, indicating stationarity.

Code
tseries::adf.test(diff(diff(employment_ts)))

    Augmented Dickey-Fuller Test

data:  diff(diff(employment_ts))
Dickey-Fuller = -9.6886, Lag order = 6, p-value = 0.01
alternative hypothesis: stationary

With a p-value below 0.05, there is sufficient evidence to reject the unit-root null hypothesis at the 5% significance level, so we conclude that the twice-differenced series is stationary. This agrees with the earlier analysis, where the ACF of the second-differenced data dropped off quickly, indicating stationarity.

Code
tseries::adf.test(diff(tsi_ts))

    Augmented Dickey-Fuller Test

data:  diff(tsi_ts)
Dickey-Fuller = -5.025, Lag order = 6, p-value = 0.01
alternative hypothesis: stationary

With a p-value below 0.05, there is sufficient evidence to reject the unit-root null hypothesis at the 5% significance level, so we conclude that the differenced series is stationary. This agrees with the earlier analysis, where the ACF of the differenced series dropped off quickly, indicating stationarity.

Code
tseries::adf.test(diff(air_ts))

    Augmented Dickey-Fuller Test

data:  diff(air_ts)
Dickey-Fuller = -2.1674, Lag order = 2, p-value = 0.5086
alternative hypothesis: stationary

With a p-value above 0.05, there is insufficient evidence to reject the unit-root null hypothesis at the 5% significance level, so the first-differenced series still tests as non-stationary. This conflicts with the earlier visual analysis, where the ACF of the differenced series showed little autocorrelation; with such a short annual series, however, the ADF test has very little power.

Code
tseries::adf.test(diff(ups_ts))

    Augmented Dickey-Fuller Test

data:  diff(ups_ts)
Dickey-Fuller = -11.498, Lag order = 12, p-value = 0.01
alternative hypothesis: stationary

With a p-value below 0.05, there is sufficient evidence to reject the unit-root null hypothesis at the 5% significance level, so we conclude that the differenced series is stationary. This agrees with the earlier analysis, where the ACF of the differenced series dropped off quickly, indicating stationarity.

ARIMA(p,d,q)

In this section, we will identify potential values for the parameters p and q based on the significant lags observed in the PACF and ACF plots of the original data, respectively. Specifically, we will consider the most significant lags from the PACF plot for the value of p, and from the ACF plot for the value of q.
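The grid searches below rank candidates by information criteria; for reference, AIC is just -2 log L + 2k, where k counts the estimated parameters including the innovation variance. A quick base-R check on the built-in lh series (the loops use forecast::Arima, which reports the same quantities):

```r
# AIC = -2 * log-likelihood + 2 * k, where k counts the ARMA coefficients,
# the intercept, and the innovation variance. BIC replaces 2*k with log(n)*k.
fit <- arima(lh, order = c(1, 0, 0))   # built-in luteinizing hormone series
k <- length(fit$coef) + 1              # ar1 + intercept, plus sigma^2
aic_manual <- -2 * fit$loglik + 2 * k
isTRUE(all.equal(aic_manual, AIC(fit)))
```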

p = 0,1,2, d = 0,1,2, q = 0,1,2,3,4,5

Code
## empty list to store model fits

ARMA_res <- list()

## set counter
cc <-1

## loop over AR
for(p in 0:2){
  ## loop over MA
  for(q in 0:5){
    ## loop over I
    for(d in 0:2){
    
      ARMA_res[[cc]]<-Arima(y=canada_ts,order = c(p,d,q), include.drift = TRUE)
      cc<- cc+1
    }
  }
}

## get AIC values for model evaluation
ARMA_AIC<-sapply(ARMA_res,function(x) x$aic)
ARMA_res[[which(ARMA_AIC == min(ARMA_AIC))]]
Series: canada_ts 
ARIMA(2,2,3) 

Coefficients:
          ar1      ar2     ma1      ma2     ma3
      -1.7310  -0.9992  0.7457  -0.7457  -1.000
s.e.   0.0025   0.0013  0.0251   0.0242   0.026

sigma^2 = 9.987:  log likelihood = -525.66
AIC=1063.32   AICc=1063.74   BIC=1083.19
Code
## get BIC values for model evaluation
ARMA_BIC<-sapply(ARMA_res,function(x) x$bic)
ARMA_res[[which(ARMA_BIC == min(ARMA_BIC))]]
Series: canada_ts 
ARIMA(2,2,3) 

Coefficients:
          ar1      ar2     ma1      ma2     ma3
      -1.7310  -0.9992  0.7457  -0.7457  -1.000
s.e.   0.0025   0.0013  0.0251   0.0242   0.026

sigma^2 = 9.987:  log likelihood = -525.66
AIC=1063.32   AICc=1063.74   BIC=1083.19

The model with the lowest AIC and BIC is ARIMA(2,2,3).

p = 0,1,2, d = 0,1,2, q = 0,1,2,3,4,5

Code
## empty list to store model fits

ARMA_res <- list()

## set counter
cc <-1

## loop over AR
for(p in 0:2){
  ## loop over MA
  for(q in 0:5){
    ## loop over I
    for(d in 0:2){
    
      ARMA_res[[cc]]<-Arima(y=employment_ts,order = c(p,d,q), include.drift = TRUE)
      cc<- cc+1
    }
  }
}

## get AIC values for model evaluation
ARMA_AIC<-sapply(ARMA_res,function(x) x$aic)
ARMA_res[[which(ARMA_AIC == min(ARMA_AIC))]]
Series: employment_ts 
ARIMA(2,2,3) 

Coefficients:
         ar1      ar2      ma1     ma2      ma3
      0.4253  -0.5218  -0.9902  0.3708  -0.3806
s.e.  0.1897   0.1252   0.1912  0.1990   0.1703

sigma^2 = 40180919:  log likelihood = -2309.48
AIC=4630.96   AICc=4631.35   BIC=4651.51
Code
## get BIC values for model evaluation
ARMA_BIC<-sapply(ARMA_res,function(x) x$bic)
ARMA_res[[which(ARMA_BIC == min(ARMA_BIC))]]
Series: employment_ts 
ARIMA(2,2,1) 

Coefficients:
         ar1      ar2      ma1
      0.4688  -0.2364  -1.0000
s.e.  0.0644   0.0643   0.0143

sigma^2 = 40643462:  log likelihood = -2311.73
AIC=4631.45   AICc=4631.63   BIC=4645.15
Code
## get AICC values for model evaluation
ARMA_AICC<-sapply(ARMA_res,function(x) x$aicc)
ARMA_res[[which(ARMA_AICC == min(ARMA_AICC))]]
Series: employment_ts 
ARIMA(2,2,3) 

Coefficients:
         ar1      ar2      ma1     ma2      ma3
      0.4253  -0.5218  -0.9902  0.3708  -0.3806
s.e.  0.1897   0.1252   0.1912  0.1990   0.1703

sigma^2 = 40180919:  log likelihood = -2309.48
AIC=4630.96   AICc=4631.35   BIC=4651.51

AIC and AICc select ARIMA(2,2,3), while BIC prefers the more parsimonious ARIMA(2,2,1); we proceed with ARIMA(2,2,3).

p = 0,1, d = 0,1, q = 0,1,2,3,4,5

Code
## empty list to store model fits

ARMA_res <- list()

## set counter
cc <-1

## loop over AR
for(p in 0:1){
  ## loop over MA
  for(q in 0:5){
    ## loop over I
    for(d in 0:1){
    
      ARMA_res[[cc]]<-Arima(y=tsi_ts,order = c(p,d,q), include.drift = TRUE)
      cc<- cc+1
    }
  }
}

## get AIC values for model evaluation
ARMA_AIC<-sapply(ARMA_res,function(x) x$aic)
ARMA_res[[which(ARMA_AIC == min(ARMA_AIC))]]
Series: tsi_ts 
ARIMA(0,1,3) with drift 

Coefficients:
          ma1      ma2     ma3   drift
      -0.0700  -0.0927  0.1035  0.1127
s.e.   0.0608   0.0580  0.0578  0.0837

sigma^2 = 2.217:  log likelihood = -499.51
AIC=1009.03   AICc=1009.25   BIC=1027.13
Code
## get BIC values for model evaluation
ARMA_BIC<-sapply(ARMA_res,function(x) x$bic)
ARMA_res[[which(ARMA_BIC == min(ARMA_BIC))]]
Series: tsi_ts 
ARIMA(0,1,0) with drift 

Coefficients:
       drift
      0.1141
s.e.  0.0901

sigma^2 = 2.246:  log likelihood = -502.8
AIC=1009.61   AICc=1009.65   BIC=1016.85
Code
## get AICC values for model evaluation
ARMA_AICC<-sapply(ARMA_res,function(x) x$aicc)
ARMA_res[[which(ARMA_AICC == min(ARMA_AICC))]]
Series: tsi_ts 
ARIMA(0,1,3) with drift 

Coefficients:
          ma1      ma2     ma3   drift
      -0.0700  -0.0927  0.1035  0.1127
s.e.   0.0608   0.0580  0.0578  0.0837

sigma^2 = 2.217:  log likelihood = -499.51
AIC=1009.03   AICc=1009.25   BIC=1027.13

AIC and AICc select ARIMA(0,1,3) with drift, while BIC prefers ARIMA(0,1,0) with drift; we proceed with ARIMA(0,1,3).

p = 0,1, d = 0,1, q = 0,1,2,3

Code
## empty list to store model fits

ARMA_res <- list()

## set counter
cc <-1

## loop over AR
for(p in 0:1){
  ## loop over MA
  for(q in 0:3){
    ## loop over I
    for(d in 0:1){
    
      ARMA_res[[cc]]<-Arima(y=air_ts,order = c(p,d,q), include.drift = TRUE)
      cc<- cc+1
    }
  }
}

## get AIC values for model evaluation
ARMA_AIC<-sapply(ARMA_res,function(x) x$aic)
ARMA_res[[which(ARMA_AIC == min(ARMA_AIC))]]
Series: air_ts 
ARIMA(0,1,1) with drift 

Coefficients:
         ma1    drift
      1.0000  -0.1510
s.e.  0.2724   5.1467

sigma^2 = 161:  log likelihood = -83.65
AIC=173.3   AICc=174.71   BIC=176.43
Code
## get BIC values for model evaluation
ARMA_BIC<-sapply(ARMA_res,function(x) x$bic)
ARMA_res[[which(ARMA_BIC == min(ARMA_BIC))]]
Series: air_ts 
ARIMA(0,1,1) with drift 

Coefficients:
         ma1    drift
      1.0000  -0.1510
s.e.  0.2724   5.1467

sigma^2 = 161:  log likelihood = -83.65
AIC=173.3   AICc=174.71   BIC=176.43

The model with the lowest AIC and BIC is ARIMA(0,1,1).

p = 0,1, d = 0,1,2, q = 0,1,2,3,4,5

Code
## empty list to store model fits

ARMA_res <- list()

## set counter
cc <-1

## loop over AR
for(p in 0:1){
  ## loop over MA
  for(q in 0:5){
    ## loop over I
    for(d in 0:2){
    
      ARMA_res[[cc]]<-Arima(y=ups_ts,order = c(p,d,q), include.drift = TRUE)
      cc<- cc+1
    }
  }
}

## get AIC values for model evaluation
ARMA_AIC<-sapply(ARMA_res,function(x) x$aic)
ARMA_res[[which(ARMA_AIC == min(ARMA_AIC))]]
Series: ups_ts 
ARIMA(0,1,0) with drift 

Coefficients:
       drift
      0.0366
s.e.  0.0557

sigma^2 = 5.465:  log likelihood = -3989.16
AIC=7982.32   AICc=7982.33   BIC=7993.26
Code
## get BIC values for model evaluation
ARMA_BIC<-sapply(ARMA_res,function(x) x$bic)
ARMA_res[[which(ARMA_BIC == min(ARMA_BIC))]]
Series: ups_ts 
ARIMA(0,1,0) with drift 

Coefficients:
       drift
      0.0366
s.e.  0.0557

sigma^2 = 5.465:  log likelihood = -3989.16
AIC=7982.32   AICc=7982.33   BIC=7993.26

The model with the lowest AIC and BIC is ARIMA(0,1,0).

Fit Best ARIMA(p,d,q) & Diagnostics

Code
ARMA_res_canada<-Arima(y=canada_ts,order = c(2,2,3), include.drift = TRUE)
summary(ARMA_res_canada)
Series: canada_ts 
ARIMA(2,2,3) 

Coefficients:
          ar1      ar2     ma1      ma2     ma3
      -1.7310  -0.9992  0.7457  -0.7457  -1.000
s.e.   0.0025   0.0013  0.0251   0.0242   0.026

sigma^2 = 9.987:  log likelihood = -525.66
AIC=1063.32   AICc=1063.74   BIC=1083.19

Training set error measures:
                      ME     RMSE      MAE        MPE     MAPE      MASE
Training set -0.02179075 3.105814 2.276102 -0.2721476 4.798624 0.3998011
                    ACF1
Training set -0.01487147

Equation: \(\nabla^2 x_t = -1.7310\,\nabla^2 x_{t-1} - 0.9992\,\nabla^2 x_{t-2} + w_t + 0.7457\,w_{t-1} - 0.7457\,w_{t-2} - w_{t-3}\), where \(\nabla^2 x_t = (1-B)^2 x_t\) is the twice-differenced series.

Code
output <- capture.output(sarima(canada_ts,2,2,3))

Code
cat(output[120:131], output[length(output)], sep = "\n") 
Coefficients: 
    Estimate     SE   t.value p.value
ar1  -1.7310 0.0025 -701.3924       0
ar2  -0.9992 0.0013 -759.3772       0
ma1   0.7457 0.0251   29.6534       0
ma2  -0.7457 0.0242  -30.8026       0
ma3  -1.0000 0.0260  -38.3988       0

sigma^2 estimated as 9.741094 on 198 degrees of freedom 
 
AIC = 5.238006  AICc = 5.239506  BIC = 5.335933 
 
 

Upon inspection of the time plot of the standardized residuals above, no discernible patterns are evident. However, the ACF plot of the standardized residuals indicates some remaining correlation. The normal Q-Q plot of the residuals suggests that the assumption of normality is reasonable, with the exception of potential outliers. The Ljung-Box p-values fall below the 0.05 (5% significance) threshold, indicating that some significant residual correlation remains. The fit is an improvement over the simpler candidates, but the leftover structure suggests a seasonal model may do better.
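The Ljung-Box panel in the sarima() output can be reproduced with base R's Box.test; a sketch on a small stand-in model (the built-in lh series), since the construction is the same for any residual series:

```r
# Ljung-Box test on model residuals: fitdf subtracts the number of fitted
# ARMA coefficients from the chi-squared degrees of freedom. Small p-values
# mean significant autocorrelation remains in the residuals.
fit <- arima(lh, order = c(1, 0, 0))
lb <- Box.test(residuals(fit), lag = 10, type = "Ljung-Box", fitdf = 1)
unname(lb$parameter)   # 9 degrees of freedom: lag - fitdf
```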

Code
ARMA_res_employment<-Arima(y=employment_ts,order = c(2,2,3), include.drift = TRUE)
summary(ARMA_res_employment)
Series: employment_ts 
ARIMA(2,2,3) 

Coefficients:
         ar1      ar2      ma1     ma2      ma3
      0.4253  -0.5218  -0.9902  0.3708  -0.3806
s.e.  0.1897   0.1252   0.1912  0.1990   0.1703

sigma^2 = 40180919:  log likelihood = -2309.48
AIC=4630.96   AICc=4631.35   BIC=4651.51

Training set error measures:
                   ME     RMSE      MAE        MPE      MAPE      MASE
Training set 263.3125 6241.208 2809.261 0.05105339 0.6005098 0.1456793
                    ACF1
Training set 0.009897766

Equation: \(\nabla^2 x_t = 0.4253\,\nabla^2 x_{t-1} - 0.5218\,\nabla^2 x_{t-2} + w_t - 0.9902\,w_{t-1} + 0.3708\,w_{t-2} - 0.3806\,w_{t-3}\), where \(\nabla^2 x_t\) is the twice-differenced series.

Code
output <- capture.output(sarima(employment_ts,2,2,3))

Code
cat(output[55:66], output[length(output)], sep = "\n") 
Coefficients: 
    Estimate     SE t.value p.value
ar1   0.4253 0.1897  2.2413  0.0260
ar2  -0.5218 0.1252 -4.1690  0.0000
ma1  -0.9902 0.1912 -5.1783  0.0000
ma2   0.3708 0.1990  1.8633  0.0637
ma3  -0.3806 0.1703 -2.2346  0.0264

sigma^2 estimated as 39293604 on 222 degrees of freedom 
 
AIC = 20.40072  AICc = 20.40192  BIC = 20.49125 
 
 

Upon examining the time plot of the standardized residuals, no discernible patterns are evident. Additionally, the ACF plot of the standardized residuals reveals no significant departures from the model assumptions, as no significant lags are observed. The normal Q-Q plot of the residuals indicates that the assumption of normality is reasonable, with the exception of possible outliers. Furthermore, the Ljung-Box test yields favorable values exceeding 0.05, suggesting the absence of significant correlation and indicating a well-fitted model. Overall, these observations suggest that the model fits the data well.

Code
ARMA_res_tsi<-Arima(y=tsi_ts,order = c(0,1,3), include.drift = TRUE)
summary(ARMA_res_tsi)
Series: tsi_ts 
ARIMA(0,1,3) with drift 

Coefficients:
          ma1      ma2     ma3   drift
      -0.0700  -0.0927  0.1035  0.1127
s.e.   0.0608   0.0580  0.0578  0.0837

sigma^2 = 2.217:  log likelihood = -499.51
AIC=1009.03   AICc=1009.25   BIC=1027.13

Training set error measures:
                       ME     RMSE      MAE         MPE      MAPE      MASE
Training set 0.0007283871 1.475506 1.068367 -0.01205377 0.9267941 0.2634875
                    ACF1
Training set 0.004977272

Equation: \(\nabla x_t = 0.1127 + w_t - 0.0700\,w_{t-1} - 0.0927\,w_{t-2} + 0.1035\,w_{t-3}\), where \(\nabla x_t = x_t - x_{t-1}\).

Code
output <- capture.output(sarima(tsi_ts,0,1,3))

Code
cat(output[21:31], output[length(output)], sep = "\n") 
Coefficients: 
         Estimate     SE t.value p.value
ma1       -0.0700 0.0608 -1.1505  0.2509
ma2       -0.0927 0.0580 -1.6003  0.1107
ma3        0.1035 0.0578  1.7915  0.0743
constant   0.1127 0.0837  1.3458  0.1795

sigma^2 estimated as 2.184965 on 272 degrees of freedom 
 
AIC = 3.655896  AICc = 3.656431  BIC = 3.721483 
 
 

Upon examining the time plot of the standardized residuals, no discernible patterns are evident. Additionally, the ACF plot of the standardized residuals reveals no significant departures from the model assumptions, as no significant lags are observed. The normal Q-Q plot of the residuals indicates that the assumption of normality is reasonable, with the exception of possible outliers. Furthermore, the Ljung-Box test yields favorable values exceeding 0.05, suggesting the absence of significant correlation and indicating a well-fitted model. Overall, these observations suggest that the model fits the data well.

Code
ARMA_res_air<-Arima(y=air_ts,order = c(0,1,1), include.drift = TRUE)
summary(ARMA_res_air)
Series: air_ts 
ARIMA(0,1,1) with drift 

Coefficients:
         ma1    drift
      1.0000  -0.1510
s.e.  0.2724   5.1467

sigma^2 = 161:  log likelihood = -83.65
AIC=173.3   AICc=174.71   BIC=176.43

Training set error measures:
                    ME     RMSE      MAE       MPE     MAPE      MASE      ACF1
Training set 0.2965719 11.79289 8.992931 0.1888945 9.662775 0.8915005 -0.264417

Equation: \(\nabla x_t = -0.1510 + w_t + w_{t-1}\)

Code
output <- capture.output(sarima(air_ts,0,1,1))

Code
cat(output[30:38], output[length(output)], sep = "\n") 
Coefficients: 
         Estimate     SE t.value p.value
ma1         1.000 0.2724  3.6708  0.0016
constant   -0.151 5.1467 -0.0293  0.9769

sigma^2 estimated as 145.6943 on 19 degrees of freedom 
 
AIC = 8.252253  AICc = 8.283999  BIC = 8.40147 
 
 

Upon examining the time plot of the standardized residuals, no discernible patterns are evident. Additionally, the ACF plot of the standardized residuals reveals no significant departures from the model assumptions, as no significant lags are observed. The normal Q-Q plot of the residuals indicates that the assumption of normality is reasonable, with the exception of possible outliers. Furthermore, the Ljung-Box test yields favorable values exceeding 0.05, suggesting the absence of significant correlation and indicating a well-fitted model. Overall, these observations suggest that the model fits the data well.

Code
ARMA_res_ups<-Arima(y=ups_ts,order = c(0,1,0), include.drift = TRUE)
summary(ARMA_res_ups)
Series: ups_ts 
ARIMA(0,1,0) with drift 

Coefficients:
       drift
      0.0366
s.e.  0.0557

sigma^2 = 5.465:  log likelihood = -3989.16
AIC=7982.32   AICc=7982.33   BIC=7993.26

Training set error measures:
                       ME     RMSE      MAE         MPE     MAPE       MASE
Training set 5.173999e-05 2.336454 1.513771 -0.01543307 1.161839 0.04821098
                   ACF1
Training set 0.03201866

Equation: \(\nabla x_t = 0.0366 + w_t\), i.e., a random walk with drift.

Code
output <- capture.output(sarima(ups_ts,0,1,0))

Code
cat(output[11:18], output[length(output)], sep = "\n") 
Coefficients: 
         Estimate     SE t.value p.value
constant   0.0366 0.0557  0.6569  0.5113

sigma^2 estimated as 5.462115 on 1758 degrees of freedom 
 
AIC = 4.537987  AICc = 4.537989  BIC = 4.54421 
 
 

Upon examining the time plot of the standardized residuals, no discernible patterns are evident. Additionally, the ACF plot of the standardized residuals reveals no significant departures from the model assumptions, as no significant lags are observed. The normal Q-Q plot of the residuals indicates that the assumption of normality is reasonable, with the exception of possible outliers. Furthermore, the Ljung-Box test yields favorable values exceeding 0.05, suggesting the absence of significant correlation and indicating a well-fitted model. Overall, these observations suggest that the model fits the data well.

Auto ARIMA

Code
auto.arima(canada_ts)
Series: canada_ts 
ARIMA(2,1,5)(1,0,0)[12] 

Coefficients:
          ar1      ar2     ma1     ma2     ma3      ma4     ma5    sar1
      -0.1410  -0.4216  0.1452  0.4440  0.0703  -0.3019  0.2020  0.5636
s.e.   0.1988   0.2128  0.1977  0.2187  0.0836   0.0907  0.1092  0.0718

sigma^2 = 8.695:  log likelihood = -508.85
AIC=1035.71   AICc=1036.63   BIC=1065.57

The model selected by auto.arima differs from the best model identified in the manual grid search, and it includes a seasonal AR term. This suggests the dataset exhibits strong seasonality that a non-seasonal ARIMA struggles to capture, so SARIMA models should be considered for this series.
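The seasonal part of the selected ARIMA(2,1,5)(1,0,0)[12] acts on lag-12 terms. For intuition, a seasonal difference at lag 12 subtracts the value from the same month one year earlier; a base-R sketch on the built-in AirPassengers series:

```r
# Seasonal (lag-12) differencing removes a stable yearly pattern; SARIMA's
# seasonal AR/MA polynomials likewise operate at multiples of lag 12.
x <- log(AirPassengers)             # monthly series with strong seasonality
sd12 <- diff(x, lag = 12)           # x_t - x_{t-12}
both <- diff(diff(x, lag = 12))     # first + seasonal difference (d=1, D=1)
c(length(x), length(sd12), length(both))   # 144 132 131
```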

Code
auto.arima(employment_ts)
Series: employment_ts 
ARIMA(0,1,1) 

Coefficients:
         ma1
      0.4204
s.e.  0.0536

sigma^2 = 41251784:  log likelihood = -2322.12
AIC=4648.25   AICc=4648.3   BIC=4655.11

The discrepancy between the model selected by auto.arima and the best model identified in the previous step raises concerns about its reliability. auto.arima relies solely on minimizing AIC/BIC values via a stepwise search, without considering the significant lags seen in the ACF/PACF plots, so it may not capture the underlying patterns in the data accurately.

Code
auto.arima(tsi_ts)
Series: tsi_ts 
ARIMA(3,1,0)(0,0,2)[12] with drift 

Coefficients:
          ar1      ar2     ar3     sma1     sma2   drift
      -0.0787  -0.0820  0.0791  -0.1773  -0.1487  0.1261
s.e.   0.0605   0.0614  0.0623   0.0634   0.0655  0.0560

sigma^2 = 2.129:  log likelihood = -493.41
AIC=1000.82   AICc=1001.24   BIC=1026.16

The model selected by auto.arima differs from the best model identified in the manual grid search, and it includes seasonal MA terms. This suggests the dataset exhibits seasonality that a non-seasonal ARIMA struggles to capture, so SARIMA models should be considered for this series.

Code
auto.arima(air_ts)
Series: air_ts 
ARIMA(0,1,0) 

sigma^2 = 213.5:  log likelihood = -86.12
AIC=174.23   AICc=174.44   BIC=175.28

Again, the discrepancy between the auto.arima selection and the best model from the previous step raises reliability concerns: auto.arima minimizes AIC/BIC alone, without reference to the significant lags in the ACF/PACF plots, and so may not capture the underlying patterns accurately.

Code
auto.arima(ups_ts)
Series: ups_ts 
ARIMA(0,1,0) 

sigma^2 = 5.463:  log likelihood = -3989.38
AIC=7980.75   AICc=7980.75   BIC=7986.22

The best model provided by auto.arima matches the best model from the step above.

Forecasting

Code
pred_canada <- forecast(ARMA_res_canada, 12)
autoplot(pred_canada, xlab = "Date", ylab = "Billions", colour = "blue")+ggtitle('U.S.-Canada Freight Value Forecast') +theme_bw()

The blue line represents the point forecast, while the two shaded bands are the prediction intervals: the darker inner band is the 80% interval and the lighter outer band is the 95% interval. The bands widen as the forecast horizon extends, reflecting growing uncertainty in the forecast.
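These bands can be reproduced by hand from the forecast standard errors. A minimal sketch with base `stats::predict()` on a simulated AR(1) series (80% and 95% are the forecast package's default levels):

```r
# Sketch: build prediction intervals at any level from the forecast SEs.
set.seed(1)
y   <- arima.sim(model = list(ar = 0.7), n = 150)
fit <- arima(y, order = c(1, 0, 0))
fc  <- predict(fit, n.ahead = 12)       # returns point forecasts and SEs

z80 <- qnorm(0.90)                      # two-sided 80% level
z95 <- qnorm(0.975)                     # two-sided 95% level
upper80 <- fc$pred + z80 * fc$se
upper95 <- fc$pred + z95 * fc$se
# The 95% band is necessarily wider, and both widen with the horizon
# because fc$se grows toward the series' unconditional standard deviation.
```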

Examining the forecast plot, there is an initial surge in the monthly U.S.-Canada freight value, followed by a decline and another increase. Comparing the widening seasonal fluctuations observed after 2020 with the forecasted fluctuations, the model appears to capture the anticipated pattern well.

Code
pred_employment <- forecast(ARMA_res_employment, 12)
autoplot(pred_employment, xlab = "Date", ylab = "Employment", colour = "blue")+ggtitle('U.S. Air Transportation Employment Forecast') +theme_bw()

The forecast plot illustrates a subtle decreasing trend in air transportation employment over the next 12 months. The colored bands depict the 80% and 95% prediction intervals. However, the forecast appears overly smooth and lacks fluctuations, suggesting the model may not adequately capture the variability inherent in the data.

Code
pred_tsi <- forecast(ARMA_res_tsi, 12)
autoplot(pred_tsi, xlab = "Date", ylab = "Transportation Services Index", colour = "blue")+ggtitle('U.S. Freight Transportation Services Index Forecast') +theme_bw()

The forecast plot indicates an ascending trend in the U.S. freight TSI over the next 12 months. The colored bands depict the 80% and 95% prediction intervals. Nonetheless, the forecast appears excessively smooth, lacking the expected fluctuations, which suggests the model may not adequately capture the variability inherent in the data.

Code
pred_air <- forecast(ARMA_res_air, 5)
autoplot(pred_air, xlab = "Date", ylab = "Freight revenue per ton-mile (current cents)", colour = "blue")+ggtitle('U.S. Domestic Air Carrier Average Freight Revenue Forecast') +theme_bw()

The forecast plot depicts a flat trend in the U.S. domestic air carrier average freight revenue over the upcoming five years. The colored bands depict the 80% and 95% prediction intervals. The forecast again appears excessively smooth, with no noticeable fluctuations, suggesting the model may not adequately capture the variability inherent in the data.

Code
pred_ups <- forecast(ARMA_res_ups, 50)
autoplot(pred_ups, xlab = "Date", ylab = "Stock Price", colour = "blue")+ggtitle('UPS Stock Price Forecast') +theme_bw()

The forecast plot illustrates a rising trend in the UPS stock price over the next 50 days. The colored bands depict the 80% and 95% prediction intervals. However, the forecast appears overly smooth, lacking noticeable fluctuations, suggesting the model may not adequately capture the variability in the stock price data.

ARIMA vs. Benchmark Methods

Code
autoplot(canada_ts, xlab = "Date", ylab = "Billions") +
  autolayer(meanf(canada_ts, h=12),
            series="Mean", PI=FALSE) +
  autolayer(naive(canada_ts, h=12),
            series="Naïve", PI=FALSE)+
  autolayer(rwf(canada_ts, h=12, drift=TRUE),
            series="Drift", PI=FALSE)+
   autolayer(pred_canada, 
            series="ARIMA",PI=FALSE) + ggtitle("U.S.-Canada Freight Value ARIMA vs. Benchmarks")+
  guides(colour=guide_legend(title="Forecast")) +theme_bw()

We observe that the ARIMA model captures the trend best, closely aligning with the observed data, and the Drift method follows a comparable trajectory. In contrast, the Mean and Naive benchmarks notably underperform, failing to capture the underlying trend.

Code
# Compute accuracy
output1 <- accuracy(pred_canada)
output2 <- accuracy(meanf(canada_ts, h=12))
output3 <- accuracy(naive(canada_ts, h=12))
output4 <- accuracy(rwf(canada_ts, drift=TRUE, h=12))

output_list <- list(output1, output2, output3, output4)

# Put into df
output_df <- do.call(rbind, output_list)

# Set row names for the dataframe
row.names(output_df) <- c("ARIMA", "Meanf", "Naive", "Drift")

output_df
                 ME     RMSE      MAE        MPE      MAPE      MASE
ARIMA -2.179075e-02 3.105814 2.276102 -0.2721476  4.798624 0.3998011
Meanf -1.819683e-15 7.158385 5.209884 -2.1984228 11.014570 0.9151249
Naive  8.439239e-02 3.747662 2.676507 -0.1592665  5.543406 0.4701329
Drift -2.088947e-15 3.746711 2.677095 -0.3338399  5.547840 0.4702363
             ACF1
ARIMA -0.01487147
Meanf  0.85587296
Naive -0.27042733
Drift -0.27042733

Upon examining the accuracy metrics, it’s evident that the ARIMA model achieved the lowest RMSE, MAE, and MAPE values, followed by the Drift model. This suggests that the ARIMA model outperforms the benchmark models in terms of accuracy.
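One caveat: `accuracy()` called on a forecast without a test set, as above, reports training-set (in-sample) errors. A holdout comparison is more telling. A minimal sketch on a simulated series (the 12-point holdout split is an assumption for illustration, not the report's setup):

```r
# Sketch: hold out the last 12 observations and compare out-of-sample RMSE
# of an AR(1) fit against the naive (repeat-last-value) benchmark.
set.seed(7)
y     <- arima.sim(model = list(ar = 0.8), n = 120)
train <- window(y, end = 108)
test  <- window(y, start = 109)

fit <- arima(train, order = c(1, 0, 0))
fc  <- predict(fit, n.ahead = 12)$pred            # forecasts for times 109..120

rmse_arima <- sqrt(mean((test - fc)^2))
rmse_naive <- sqrt(mean((test - tail(train, 1))^2))  # naive: repeat last value
c(ARIMA = rmse_arima, Naive = rmse_naive)
```

Ranking models on a holdout window guards against the overfitting that purely in-sample metrics can reward.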

Code
autoplot(employment_ts, xlab = "Date", ylab = "Employment") +
  autolayer(meanf(employment_ts, h=12),
            series="Mean", PI=FALSE) +
  autolayer(naive(employment_ts, h=12),
            series="Naïve", PI=FALSE)+
  autolayer(rwf(employment_ts, h=12, drift=TRUE),
            series="Drift", PI=FALSE)+
   autolayer(pred_employment, 
            series="ARIMA",PI=FALSE) + ggtitle("U.S. Air Transportation Employment ARIMA vs. Benchmarks")+
  guides(colour=guide_legend(title="Forecast")) +theme_bw()

We observe that the ARIMA, Drift, and Naive models follow comparable trajectories and capture the trend well, closely aligning with the observed data. In contrast, the Mean benchmark notably underperforms, failing to capture the underlying trend.

Code
# Compute accuracy
output1 <- accuracy(pred_employment)
output2 <- accuracy(meanf(employment_ts, h=12))
output3 <- accuracy(naive(employment_ts, h=12))
output4 <- accuracy(rwf(employment_ts, drift=TRUE, h=12))

output_list <- list(output1, output2, output3, output4)

# Put into df
output_df <- do.call(rbind, output_list)

# Set row names for the dataframe
row.names(output_df) <- c("ARIMA", "Meanf", "Naive", "Drift")

output_df
                 ME      RMSE       MAE         MPE      MAPE      MASE
ARIMA  2.633125e+02  6241.208  2809.261  0.05105339 0.6005098 0.1456793
Meanf -2.235212e-11 25101.751 21628.241 -0.28679049 4.5989461 1.1215716
Naive -8.421053e+01  7028.726  3175.439 -0.02909029 0.6871776 0.1646681
Drift -4.059817e-12  7028.221  3184.303 -0.01132370 0.6889589 0.1651278
             ACF1
ARIMA 0.009897766
Meanf 0.957521645
Naive 0.376333238
Drift 0.376333238

Upon examining the accuracy metrics, it’s evident that the ARIMA model achieved the lowest RMSE, MAE, and MAPE values, followed by the Drift model. This suggests that the ARIMA model outperforms the benchmark models in terms of accuracy.

Code
autoplot(tsi_ts, xlab = "Date", ylab = "Transportation Services Index") +
  autolayer(meanf(tsi_ts, h=12),
            series="Mean", PI=FALSE) +
  autolayer(naive(tsi_ts, h=12),
            series="Naïve", PI=FALSE)+
  autolayer(rwf(tsi_ts, h=12, drift=TRUE),
            series="Drift", PI=FALSE)+
   autolayer(pred_tsi, 
            series="ARIMA",PI=FALSE) + ggtitle("U.S. Freight Transportation Services Index ARIMA vs. Benchmarks")+
  guides(colour=guide_legend(title="Forecast")) +theme_bw()

We observe that the ARIMA model captures the trend best, closely aligning with the observed data, and the Drift method follows a comparable trajectory. In contrast, the Mean and Naive benchmarks notably underperform, failing to capture the underlying trend.

Code
# Compute accuracy
output1 <- accuracy(pred_tsi)
output2 <- accuracy(meanf(tsi_ts, h=12))
output3 <- accuracy(naive(tsi_ts, h=12))
output4 <- accuracy(rwf(tsi_ts, drift=TRUE, h=12))

output_list <- list(output1, output2, output3, output4)

# Put into df
output_df <- do.call(rbind, output_list)

# Set row names for the dataframe
row.names(output_df) <- c("ARIMA", "Meanf", "Naive", "Drift")

output_df
                 ME      RMSE       MAE         MPE      MAPE      MASE
ARIMA  7.283871e-04  1.475506  1.068367 -0.01205377 0.9267941 0.2634875
Meanf -1.056836e-14 13.009848 11.112711 -1.22065210 9.5135777 2.7406872
Naive  1.141304e-01  1.500374  1.096014  0.08659501 0.9496963 0.2703060
Drift  2.500415e-15  1.496027  1.087839 -0.01244364 0.9430018 0.2682897
              ACF1
ARIMA  0.004977272
Meanf  0.987698495
Naive -0.064131500
Drift -0.064131500

Upon examining the accuracy metrics, it’s evident that the ARIMA model achieved the lowest RMSE, MAE, and MAPE values, followed by the Drift model. This suggests that the ARIMA model outperforms the benchmark models in terms of accuracy.

Code
autoplot(air_ts, xlab = "Date", ylab = "Freight revenue per ton-mile (current cents)") +
  autolayer(meanf(air_ts, h=5),
            series="Mean", PI=FALSE) +
  autolayer(naive(air_ts, h=5),
            series="Naïve", PI=FALSE)+
  autolayer(rwf(air_ts, h=5, drift=TRUE),
            series="Drift", PI=FALSE)+
   autolayer(pred_air, 
            series="ARIMA",PI=FALSE) + ggtitle("U.S. Domestic Air Carrier Average Freight Revenue ARIMA vs. Benchmarks")+
  guides(colour=guide_legend(title="Forecast")) +theme_bw()

We observe that the ARIMA model appears to underperform here, failing to accurately capture the underlying trend.

Code
# Compute accuracy
output1 <- accuracy(pred_air)
# h matched to the 5-step forecast; accuracy() here reports training errors,
# so the value of h does not change the results
output2 <- accuracy(meanf(air_ts, h=5))
output3 <- accuracy(naive(air_ts, h=5))
output4 <- accuracy(rwf(air_ts, drift=TRUE, h=5))

output_list <- list(output1, output2, output3, output4)

# Put into df
output_df <- do.call(rbind, output_list)

# Set row names for the dataframe
row.names(output_df) <- c("ARIMA", "Meanf", "Naive", "Drift")

output_df
                 ME     RMSE       MAE          MPE      MAPE      MASE
ARIMA  2.965719e-01 11.79289  8.992931   0.18889448  9.662775 0.8915005
Meanf -2.306236e-15 31.95599 29.965043 -12.98493762 35.854642 2.9705387
Naive  9.865931e-01 14.61164 10.087410  -0.04930048 11.202702 1.0000000
Drift -3.383537e-15 14.57830  9.911415  -1.18896301 10.999729 0.9825529
            ACF1
ARIMA -0.2644170
Meanf  0.8911918
Naive  0.2602701
Drift  0.2602701

Despite the visual impression from the plot, the in-sample accuracy metrics show the ARIMA model with the lowest RMSE, MAE, and MAPE, followed by the Drift model. By these measures ARIMA still outperforms the benchmarks, even though its fit to the trend looks questionable.

Code
autoplot(ups_ts, xlab = "Date", ylab = "Stock Price") +
  autolayer(meanf(ups_ts, h=50),
            series="Mean", PI=FALSE) +
  autolayer(naive(ups_ts, h=50),
            series="Naïve", PI=FALSE)+
  autolayer(rwf(ups_ts, h=50, drift=TRUE),
            series="Drift", PI=FALSE)+
   autolayer(pred_ups, 
            series="ARIMA",PI=FALSE) + ggtitle("UPS Stock Price ARIMA vs. Benchmarks")+
  guides(colour=guide_legend(title="Forecast")) +theme_bw()

We observe that the ARIMA, Drift, and Naive models follow comparable trajectories and capture the trend well, closely aligning with the observed data. In contrast, the Mean benchmark notably underperforms, failing to capture the underlying trend.

Code
# Compute accuracy
output1 <- accuracy(pred_ups)
# h matched to the 50-step forecast; accuracy() here reports training errors,
# so the value of h does not change the results
output2 <- accuracy(meanf(ups_ts, h=50))
output3 <- accuracy(naive(ups_ts, h=50))
output4 <- accuracy(rwf(ups_ts, drift=TRUE, h=50))

output_list <- list(output1, output2, output3, output4)

# Put into df
output_df <- do.call(rbind, output_list)

# Set row names for the dataframe
row.names(output_df) <- c("ARIMA", "Meanf", "Naive", "Drift")

output_df
                ME      RMSE       MAE          MPE      MAPE       MASE
ARIMA 5.173999e-05  2.336454  1.513771  -0.01543307  1.161839 0.04821098
Meanf 9.806701e-14 40.426672 38.180111 -10.12057196 31.863702 1.21597051
Naive 3.660449e-02  2.337404  1.515061   0.01546450  1.162754 0.04825209
Drift 5.107578e-15  2.337117  1.514580  -0.01549867  1.162442 0.04823674
            ACF1
ARIMA 0.03201866
Meanf 0.99795274
Naive 0.03201881
Drift 0.03201881

The ARIMA model again posts the lowest RMSE, MAE, and MAPE, but its margins over the Naive and Drift methods are negligible. This is typical of a near-random-walk series such as a stock price, for which an ARIMA(0,1,0) fit and the naive forecast are essentially the same model.
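Near-identical errors across methods are exactly what a random walk produces. Whether any predictable structure remains can be checked with a Ljung-Box test on the differenced series; a sketch on simulated data (`Box.test()` is in base `stats`):

```r
# Sketch: if the first differences are white noise, no ARMA structure is
# left to model and the random-walk (ARIMA(0,1,0)) description is adequate.
set.seed(3)
price <- cumsum(rnorm(250))   # simulated random-walk stand-in for a price series
bt <- Box.test(diff(price), lag = 10, type = "Ljung-Box")
bt
# A large p-value indicates no significant autocorrelation in the differences.
```

Running the same test on `diff(ups_ts)` would indicate whether anything beyond the random walk is worth modeling.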
